FOREWORD


This research and development was conducted in response to Navy decision coordinating
paper Z1187-PN (Computer-based Manpower Planning and Programming), subproject
PN.02 (Officer Personnel Management Models) and was sponsored by the Deputy Chief of
Naval Operations (Manpower, Personnel, and Training) (OP-01). The objective of this
subproject is to develop a set of user-oriented, computer-based models and data bases to
assist in the development of a Navy officer force that meets the requirements for officer
manpower.

During  this  effort,  various  econometric  techniques  were  reviewed  to  estimate  models 
using  pooled  time-series  and  cross-section  data.  The  techniques  discussed  herein  will  be 
evaluated  in  terms  of  their  theoretical  and  practical  utility  for  forecasting  Navy  officer 
personnel  loss  rates. 


J. W. RENARD                         JAMES W. TWEEDDALE
Captain, U.S. Navy                   Technical Director
Commanding Officer


SUMMARY


In many practical regression problems, sufficient observations are not available to
estimate separate time-series or cross-section equations. One approach to this problem is
to combine the time-series and cross-section data into one model. Such models have the
practical advantage of allowing the estimated coefficients of the combined sample to
incorporate both time-series and cross-section characteristics. For example, the officer
retention forecasting model (ORFM) within the structured accession planning system for
officers (STRAP-O) uses time-series (1970-1983) and cross-section (community by pay
grade by length of service) data to forecast officer loss rates based on a variety of
economic scenarios. Time-series observations allow the measurement of the effects of
external influences (e.g., civilian unemployment) on officer loss behavior; cross-section
observations allow the measurement of the effects of internal influences (e.g., promotion
opportunities). The search for an appropriate model for combining these data provides the
motivation for this report.

Objective 

The  objective  of  this  effort  was  to  review  various  econometric  techniques  used  to 
estimate  and  test  models  that  use  pooled  time-series  and  cross-section  data. 

Approach 

The  literature  relevant  to  the  estimation  and  testing  of  three  such  models  was 
reviewed. 

Results 

In Model I, the intercept and slope coefficients are assumed to be constant over all
cross-sections and time periods. In Model II, the intercepts are allowed to vary over both
cross-sections and time periods. Model III allows all coefficients to vary over
cross-sections and time. Moreover, for Models II and III, the coefficients may be viewed as
either fixed or random effects. In a fixed-effects model, statistical inferences outside
the set of sample observations cannot be drawn. In a random-effects model, statistical
inferences about the entire population can be drawn.

Conclusions 

Models I, II, and III are hierarchical in the sense that fewer restrictions are
imposed on the model's structure as the user moves from Model I to Model III. This
enables the user to move systematically down the list of models as he or she discovers,
either via hypothesis testing of fitted models or prior information, that the imposed
restrictions may be successively relaxed. However, it should be recognized that the
satisfaction of the assumptions of the model does not validate or invalidate the
appropriateness of the model for the application at hand.

INTRODUCTION


Problem 


Whenever observations are available for several cross-sections or individual[1] units
over  a  period  of  time,  the  cross-sectional  observations  may  be  pooled  with  the  time-series 
observations  under  one  of  several  models.  The  primary  advantage  of  these  models  is  that 
they  allow  the  estimated  coefficients  of  the  combined  sample  to  incorporate  both  time- 
series  and  cross-section  characteristics  within  a  specified  structure.  Another  potential 
advantage  of  this  approach  is  that,  depending  upon  the  model,  pooling  may  result  in  an 
increase  in  the  degrees  of  freedom  of  the  estimated  coefficients  that,  all  other  things 
being  equal,  will  yield  more  efficient  estimates. 

Cross-section estimates generally differ from time-series estimates made on
observations drawn from the same population. In addition to the empirical observation that
cross-section variation is generally greater than time-series variation, cross-sections
typically will reflect long-run adjustments, whereas time series tend to reflect
short(er)-run reactions. This is because disequilibrium among individuals is generally synchronized
in  response  to  overall  market  forces  so  that  many  of  the  disequilibrium  effects  tend  to 
disappear  in  cross-section  estimates.  A  time  series  of  observations  on  a  particular 
individual  will  exhibit  a  less  completely  adjusted  response  to  the  same  overall  factors. 
Note,  however,  that  cross-section  observations,  in  general,  will  also  contain  some  short- 
run  disturbances  so  that  estimates  based  upon  cross-section  data  will  only  approximate 
fully  adjusted,  long-run  coefficients.  For  further  discussion  of  these  points,  see  Klein 
(1974),  Bass  and  Wittink  (1975)  and,  especially,  Kuh  (1959). 

The  problem  involved  in  using  pooled  time-series  and  cross-section  data  is  to  specify 
a  model  that  can  adequately  allow  for  differences  among  cross-section  units  for  a  given 
time  period,  as  well  as  allow  for  differences  among  time-series  units  for  a  given  cross- 
section. After the model has been specified and estimated, hypothesis tests must be
conducted on the estimated parameters to help in evaluating the model's appropriateness.
If  the  separate  time-series  and  cross-section  observation  sets  are  sufficiently  large,  it 
may  also  be  advantageous  to  estimate  separate  time-series  and  cross-section  models  to 
provide  a  basis  for  comparison  with  the  pooled  model. 


The general linear model may be written as

$$Y = X\beta + u, \qquad (1)$$

where $Y = [y_{nt}]$, $X = [x_{knt}]$, $\beta = [\beta_{knt}]$, and $u = [u_{nt}]$.


The subscript n = 1, ..., N refers to a particular cross-section unit or individual;
t = 1, ..., T refers to a particular time period; and k = 1, ..., K refers to a particular
nonstochastic independent variable. Therefore, $y_{nt}$ is the observation on the dependent
variable for individual n at time t. The observation of independent variable k for
individual n at time t is $x_{knt}$. The stochastic disturbance term, u, will be assumed to
have the standard Gauss-Markov properties; that is, $E(u_{nt}) = 0$ and
$E(u_{nt}^2) = \mathrm{Var}(u_{nt}) = \sigma_u^2$. As in standard linear models,
$\beta_{knt}$ is an unknown parameter to be estimated.

[1] The terms "individual" and "cross-section" will be used interchangeably in this
paper.

Objective 

The  objective  of  this  effort  was  to  review  various  econometric  techniques  used  to 
estimate  and  test  models  that  use  pooled  time-series  and  cross-section  data. 

Background 

In  many  applied  manpower  models,  sufficient  observations  are  not  available  to 
estimate  separate  time-series  or  cross-section  equations.  Quite  often,  researchers  have 
resorted  to  combining  the  time-series  and  cross-section  data  into  one  model.  In  many 
instances,  these  models  have  been  estimated  with  inappropriate  econometric  techniques. 
An equally serious problem is the failure to conduct hypothesis tests of the underlying
assumptions  of  the  model  prior  to  estimation.  This  research  provides  a  systematic  review 
of  the  available  techniques  to  estimate  and  test  pooled  time-series  cross-section  models. 

One  example  of  the  use  of  pooled  data  in  manpower  modelling  is  the  officer  retention 
forecasting  module  (ORFM)  within  the  structured  accession  planning  model  for  officers 
(STRAP-O)  (Siegel,  1983).  ORFM  forecasts  Navy  officer  loss  behavior  based  on  a  variety 
of  economic  scenarios.  In  addition  to  increasing  the  number  of  available  observations,  the 
use  of  pooled  data  by  ORFM  provides  the  opportunity  to  incorporate  both  external  and 
internal  variables  into  the  model.  Variations  in  loss  behavior  over  time,  for  example, 
capture  the  effects  of  variations  in  external  variables,  such  as  unemployment  rates. 
Conversely,  variations  in  loss  behavior  within  a  given  year  across  communities,  pay 
grades,  and  length  of  service  (LOS)  capture  the  effect  of  internal  variables,  such  as 
promotion  opportunities. 

The ORFM approach is actually a two-stage process. In stage I, a cost-of-leaving
(COL)  is  calculated  via  dynamic  programming  techniques  for  each  unrestricted  line  (URL) 
community,  pay  grade,  and  LOS  cell  for  years  1970  through  1983.  COL  is  defined  as  the 
difference  between  the  present  value  of  future  earnings  from  remaining  in  the  Navy  1 
additional  year,  and  then  making  the  "optimal"  stay  or  leave  decision,  and  the  present 
value  of  future  earnings  from  leaving  the  Navy  and  entering  the  civilian  labor  market 
immediately.  Among  the  parameters  required  to  derive  COL  estimates  are  military  basic 
pay,  regular  military  compensation,  civilian  age-earnings  profiles,  military  promotion,  and 
involuntary  separation  probabilities. 

In  stage  II,  voluntary  loss  rates  are  related  via  logistic  regression  analysis  to  the  COL 
estimates  and  other  variables  that  are  hypothesized  to  influence  officer  retention 
behavior.  The  choice  of  model  specification  and  estimation  technique  is  critical  at  this 
stage,  since  ORFM  is  using  pooled  time-series  (1970-1983)  and  cross-section  (community 
by  pay  grade  by  LOS)  estimates  of  the  cost  of  leaving  the  military. 


APPROACH 


The literature relevant to the estimation and testing of three models that use pooled
time-series and cross-section data was reviewed. Each of these models places certain
restrictions upon the structure of the $\beta_{knt}$. Model I assumes that the $\beta_{knt}$,
the intercept and slope coefficients, are constant over all individuals and all time
periods. Model II also assumes that the slope coefficients are constants but allows the
intercept coefficients to vary over individuals and time. Model III assumes that all
coefficients ($\beta_{knt}$) can vary over both individuals and time. Note that some further
restrictions on the structure of the $\beta_{knt}$ are necessary in Model III since, without
any structure, there are K x N x T parameters to be estimated from N x T observations.

Additionally, in Models II and III, the $\beta_{knt}$ may be viewed as either fixed or random.
In the fixed-effects case, it is assumed that the sample of N x T observations is equivalent
to the population under consideration; that is, there is no interest in making inferences
about any set of observations other than that under consideration. In the random-effects
case, the interest lies in making inferences about the population from which the
observations are merely one sample. Hence, the coefficients are treated as random
variables with means and variance structures that are estimated from the observations.
In Model II, the fixed-effects approach is known as the covariance model, while the
random-effects approach is referred to as the error components or variance components
model.


RESULTS 


Model I--All Coefficients Constant


This model assumes that, in equation (1), $\beta_{knt} = \beta_k$, so that there is no
variation over either individuals (cross-section units) or time. This is equivalent to
running one large pooled regression with NT-K degrees of freedom. This model is the
standard approach to pooling.


To test the implicit hypothesis $\beta_{knt} = \beta_k$ for all k, n, and t, it is necessary
to allow the hypothesis to be violated and perform a statistical test. The way in which the
hypothesis is allowed to be violated is primarily a question of data availability: separate
cross-section regressions are run for each time period, or separate time-series regressions
are run for each individual. Either or both of these sets of regression equations, if
available, can then be used to perform a Chow (Maddala, 1977) test for structural change of
the regression coefficients between the individualized equations and the pooled equation.
Note that how the individualized equations are estimated usually depends on whether one
has a small cross-section of long time series or a short time series of large
cross-sections. In the luxurious case of a long time series of a large cross-section, one
should perform two tests: one for stability across individuals and the other for stability
across time. In the event that the null hypothesis is rejected, this testing will help
determine what further model is appropriate.


As an example of testing for stability of coefficients across time, let $\hat{u}$ be the
(NT x 1) vector of ordinary least-squares (OLS) residuals resulting from estimation of
equation (1) under the pooled hypothesis. Let $\hat{u}_t$ be the (N x 1) vector of OLS
residuals resulting from the estimation of a different equation (1) for each time period, t.
Then the statistic P may be formed as

$$P = \frac{(Q^{*} - Q)/[(T-1)K]}{Q/[(N-K)T]},$$

where $Q^{*} = \hat{u}'\hat{u}$ and $Q = \sum_{t=1}^{T} \hat{u}_t'\hat{u}_t$. P has the
Fisher F distribution with (T-1)K and (N-K)T degrees of freedom. It should be noted that
this test assumes that all Gauss-Markov assumptions hold for each of the individualized
regression equations, whether they consist of a cross-section of time-series equations or
a time series of cross-section equations. Maddala (1977) discussed this particular test, as
well as several conditional tests that are also available.
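The statistic P can be sketched numerically as follows. This is a minimal illustration, not the report's own implementation: it assumes NumPy is available, uses simulated data, and the function names (`rss`, `stability_over_time_stat`) and the array layout (one cross-section of N individuals per period) are hypothetical choices made for the example.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def stability_over_time_stat(X, y):
    """P statistic for equal coefficients across the T time periods.

    X has shape (T, N, K) and y has shape (T, N); the constant term is
    one of the K columns of X.
    """
    T, N, K = X.shape
    q_star = rss(X.reshape(T * N, K), y.reshape(T * N))  # pooled fit, Q*
    q = sum(rss(X[t], y[t]) for t in range(T))           # separate fits, Q
    df_num, df_den = (T - 1) * K, (N - K) * T
    return ((q_star - q) / df_num) / (q / df_den), df_num, df_den

# Simulated (hypothetical) data: T=4 periods, N=30 individuals, K=2.
rng = np.random.default_rng(0)
T, N, K = 4, 30, 2
X = np.stack([np.column_stack([np.ones(N), rng.normal(size=N)])
              for _ in range(T)])
y = X @ np.array([1.0, 0.5]) + rng.normal(scale=0.3, size=(T, N))
P, df_num, df_den = stability_over_time_stat(X, y)

# Same data with a different slope in the last period only.
y_break = y.copy()
y_break[-1] += 2.0 * X[-1][:, 1]
P_break, _, _ = stability_over_time_stat(X, y_break)
print(P, P_break, df_num, df_den)
```

P would then be compared with a critical value of the F distribution with `df_num` and `df_den` degrees of freedom; the second data set is included only to show that a shift in one period's slope inflates the statistic.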

If it cannot be assumed that each of the individualized regression equations meets all
of the Gauss-Markov assumptions, then generalized least-squares may be applied to the
entire pooled sample. Kmenta (1971) gives results for a time series of cross-section
equations  in  which  the  coefficients  are  the  same  for  all  individuals  and  all  time  periods, 
the  disturbance  vector  for  each  individual  follows  a  first-order  autoregressive  process, 
and  the  disturbances  for  different  individuals  are  heteroskedastic  and  mutually  correlated. 
Note  that  Kmenta's  model  can  be  modified  to  allow  for  an  arbitrary  time-series  process  as 
long  as  it  can  be  identified  and  estimated. 

Model II--Constant Slope, Variable Intercept Coefficients

The  next  level  of  complexity  of  modelling  the  pooled  data  is  to  allow  the  varying 
intercept  coefficients  to  capture  differences  in  behavior  over  individuals  and  over  time, 
while  holding  the  slope  coefficients  constant  over  individuals  and  time.  The  coefficients 
of  this  model  can  be  written  as 


$$\beta_{knt} = \begin{cases} \beta_k, & k \neq 1 \\ \alpha + \gamma_n + \delta_t, & k = 1. \end{cases}$$

Therefore, the entire model can be written as

$$y_{nt} = \alpha + \gamma_n + \delta_t + \sum_{k=2}^{K} x_{knt}\beta_k + u_{nt}, \quad n = 1, \ldots, N;\; t = 1, \ldots, T, \qquad (2)$$

where $\alpha + \gamma_n + \delta_t$ is the intercept for the nth individual at time t,
$\alpha$ is the "mean" intercept for all observations, $\gamma_n$ is the difference from
$\alpha$ for the nth individual, and $\delta_t$ is the difference from $\alpha$ for the tth
time period. Note that $\gamma_n$ is common across all time periods and $\delta_t$ is common
across all individuals.

This discussion will closely follow that of Mundlak (1978). The appropriate estimation
procedure depends upon whether $\gamma_n$ and $\delta_t$ are assumed to be fixed or random.
If they are assumed to be fixed, equation (2) may be estimated as a covariance model. If
they are assumed to be random, equation (2) may be estimated as a variance components or
error components model. Before discussing the selection of fixed versus random effects, the
estimation procedures for each submodel will be discussed fully so that similarities will
become apparent.


Note  that  equation  (2)  is  the  most  general  formulation  of  Model  II.  A  restricted 
version  of  this  model  exists  where  either  the  time  or  the  individual  effect  is  assumed  to 
be  absent.  Further  discussion  of  this  restricted  version  of  the  model  may  be  found  in 
Balestra  and  Nerlove  (1966),  Maddala  (1971),  Nerlove  (1971a),  and  Swamy  (1971).  Further 
discussion  of  the  model  in  equation  (2)  may  be  found  in  Mundlak  (1978),  Nerlove  (1971b), 
Swamy  (1971),  Swamy  and  Arora  (1972),  and  Wallace  and  Hussain  (1969). 

Covariance  Model 


When $\gamma_n$ and $\delta_t$ are treated as fixed effects, one of the $\gamma_n$'s and one
of the $\delta_t$'s are redundant. Otherwise, the model represented by equation (2) is not
of full rank and, hence, is not estimable via ordinary matrix inversion algorithms. The
restrictions $\sum_n \gamma_n = 0$ and $\sum_t \delta_t = 0$ need to be imposed to maintain
a model of full rank. (A discussion of estimability, restrictions, and alternative
parameterizations may be found in Scheffe (1959).) The model may be reparameterized so that
$\gamma_n^{*} = \gamma_n - \gamma_1$ for n = 2, ..., N and $\delta_t^{*} = \delta_t - \delta_1$
for t = 2, ..., T. Then, for the nth individual, the model may be written as

$$y_n = 1_T(\alpha^{*} + \gamma_n^{*}) + \begin{bmatrix} 0' \\ I_{T-1} \end{bmatrix}\delta^{*} + X_{Rn}\beta_R + u_n,$$

where $y_n$ is a (T x 1) vector of observations on the dependent variable for the nth
individual; $1_T$ is a (T x 1) vector of ones; $\alpha^{*} = \alpha + \gamma_1 + \delta_1$ is
the intercept of the reparameterized model; $\delta^{*}$ is a [(T-1) x 1] vector of the
$\delta_t^{*}$; $X_{Rn}$ is the [T x (K-1)] matrix without a constant term that is the
reduced version of $X_n$, the (T x K) matrix implicit in equation (1);
$\beta_R' = (\beta_2, \ldots, \beta_K)$ as in equation (2); $u_n' = (u_{n1}, \ldots, u_{nT})$;
and the other elements are conformable. Noting that $\gamma_1^{*} = 0$, the entire set of NT
observations may be written as

$$y = [\,1_{NT} \;\; Z_1 \;\; Z_2 \;\; X_R\,] \begin{bmatrix} \alpha^{*} \\ \gamma^{*} \\ \delta^{*} \\ \beta_R \end{bmatrix} + u, \qquad (3)$$

where $y' = (y_1', \ldots, y_N')$, $X_R' = (X_{R1}', \ldots, X_{RN}')$,
$u' = (u_1', \ldots, u_N')$, $\gamma^{*}$ is a [(N-1) x 1] vector of the $\gamma_n^{*}$, and
$Z_1$ and $Z_2$ are the conformable matrices of individual and time dummy variables.

This is the fixed-effects version of Model II written in the form of equation (1). The
assumptions on the disturbance term are that $E(u) = 0$ and $E(uu') = \sigma_u^2 I_{NT}$.
Therefore, via the Gauss-Markov Theorem, the OLS estimator of the [(N+T+K-2) x 1] parameter
vector in equation (3) is unbiased and has minimum variance among all linear unbiased
estimators.

Several hypotheses concerning the coefficients may now be tested using the usual
least-squares (OLS) procedures. Of particular interest is the hypothesis of whether
anything may be gained from moving from Model I to Model II. This is equivalent to the
hypothesis that $\gamma_2^{*} = \cdots = \gamma_N^{*} = \delta_2^{*} = \cdots = \delta_T^{*} = 0$.
It may be tested using the conventional F test that compares the restricted (via the
hypothesis) sum of squares with the unrestricted sum of squares. The restricted sum of
squares is available from a fit of Model I to the data. The test statistic has the same
form as the statistic P given in the section on Model I above.
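A minimal sketch of this comparison, assuming NumPy and simulated data, is given below. The helper names (`lsdv_design`, `f_test_model1_vs_model2`) are hypothetical, and the degrees of freedom used here (N+T-2 restrictions in the numerator, NT-(N+T+K-2) in the denominator) follow from the usual restricted-versus-unrestricted F test rather than from a formula stated in the text.

```python
import numpy as np

def lsdv_design(N, T, X_R):
    """Design matrix [1_NT, Z1, Z2, X_R] in the layout of equation (3).

    Rows are ordered individual by individual (n outer, t inner).
    """
    ones = np.ones((N * T, 1))
    Z1 = np.kron(np.eye(N)[:, 1:], np.ones((T, 1)))  # dummies for n = 2..N
    Z2 = np.kron(np.ones((N, 1)), np.eye(T)[:, 1:])  # dummies for t = 2..T
    return np.hstack([ones, Z1, Z2, X_R])

def rss(X, y):
    """Residual sum of squares from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

def f_test_model1_vs_model2(N, T, X_R, y):
    """F statistic for the hypothesis that all dummy coefficients are zero."""
    K = X_R.shape[1] + 1                                   # K counts the constant
    rss_restricted = rss(np.hstack([np.ones((N * T, 1)), X_R]), y)  # Model I
    rss_unrestricted = rss(lsdv_design(N, T, X_R), y)               # Model II
    df_num = N + T - 2
    df_den = N * T - (N + T + K - 2)
    F = ((rss_restricted - rss_unrestricted) / df_num) / (rss_unrestricted / df_den)
    return F, df_num, df_den

# Simulated (hypothetical) panel with individual and time effects.
rng = np.random.default_rng(3)
N, T = 6, 5
x = rng.normal(size=(N * T, 1))
gamma = rng.normal(size=N)
delta = rng.normal(size=T)
y = (1.0 + np.repeat(gamma, T) + np.tile(delta, N)
     + 0.5 * x[:, 0] + rng.normal(scale=0.2, size=N * T))
F, df_num, df_den = f_test_model1_vs_model2(N, T, x, y)
print(F, df_num, df_den)
```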

Another  hypothesis  of  interest  is  that  concerning  whether  the  slope  coefficients  are 
constant  over  individuals  and  time.  To  test  this  hypothesis,  the  data  must  be  arrayed  in  a 
fashion  so  that  different  slope  coefficients,  as  well  as  different  intercept  coefficients, 
may  be  estimated.  This  requires  estimating  either  a  time  series  of  cross-sections  or  a 
cross-section  of  time  series  with  the  degrees  of  freedom  constraints  N+K<T  and  T+K<N 
respectively.  Then,  using  the  F  test  comparing  the  restricted  with  the  unrestricted  sums 
of  squares,  the  hypotheses  of  constant  versus  varying  slope  coefficients  over  both 
individuals  and  time,  as  appropriate,  may  be  tested. 

Two  important  problems  are  associated  with  the  use  of  this  covariance  model  for 
pooling.  The  first  is  that  the  dummy  variables  included  in  equation  (3)  for  shifting  the 
intercept  of  the  regression  equation  over  both  time  and  individuals  do  not  directly  identify 
the  variables  that  are  causing  the  shifts.  This  is  a  standard  problem  with  using  dummy 
variables  since  the  dummies  are  functioning  as  proxies  for  variables  that  are  missing  from 
the  model.  Therefore,  dummy  variable  coefficients  are  difficult  to  interpret.  The  reader 
should  recall  that  the  dummies  represent  either  cross-section  or  time-series  differences 
from the overall mean, $\alpha$.

The  second  problem  involved  with  the  covariance  model  is  that  it  uses  up  a 
substantial  number  of  degrees  of  freedom.  In  Model  I,  K  coefficients  are  estimated  using 
NT  observations  but,  in  this  covariance  model  of  Model  II,  N+T+K-2  coefficients  are 
estimated  using  NT  observations.  An  implicit  assumption  of  the  above  development,  now 
made explicit, is that N+T+K-2<NT. Additionally, the precision of the estimates from the
covariance model declines as the number of coefficients approaches the number of
observations, because few residual degrees of freedom remain.
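The bookkeeping above can be checked with a few lines of arithmetic. The sketch below is illustrative only; the function name and the panel dimensions (N = 10 cross-section cells, T = 14 years, K = 3 coefficients) are hypothetical values chosen for the example.

```python
def coefficient_counts(N, T, K):
    """Coefficients estimated and residual degrees of freedom per model."""
    obs = N * T
    model_1 = K                  # pooled regression (Model I)
    model_2 = N + T + K - 2      # covariance model (Model II, fixed effects)
    return {
        "observations": obs,
        "model_I": (model_1, obs - model_1),
        "model_II_fixed": (model_2, obs - model_2),
    }

counts = coefficient_counts(N=10, T=14, K=3)
print(counts)
```

With these dimensions the covariance model estimates 25 coefficients from 140 observations, compared with 3 coefficients for the pooled regression.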


Another problem that is machine- and implementation-dependent is that the cross-product
matrix of dimension N+T+K-2 may be too large for inversion. This problem may be overcome
by considering the partitioned inverse that yields the OLS estimator for $\beta_R$ in
equation (3). Application of formulas for partitioned inverses and simplification yields:

$$\hat{\beta}_{RW} = (X_R' Q X_R)^{-1}(X_R' Q y).$$

The (NT x NT) matrix Q is idempotent and is defined by

$$Q = I_{NT} - \left(I_N \otimes \frac{1_T 1_T'}{T}\right) - \left(\frac{1_N 1_N'}{N} \otimes I_T\right) + \frac{1_{NT} 1_{NT}'}{NT}. \qquad (4)$$

This matrix is a generalization of the usual deviation-from-means matrix
$A = I_N - 1_N 1_N'/N$ (see Theil (1971)). This matrix arises from averaging equation (2)
over n, t, and both n and t and subtracting the two single averages from the sum of
equation (2) and the double average. Using the obvious notation, this yields:

$$y_{nt} - \bar{y}_{n.} - \bar{y}_{.t} + \bar{y}_{..} = \sum_{k=2}^{K} (x_{knt} - \bar{x}_{kn.} - \bar{x}_{k.t} + \bar{x}_{k..})\beta_k + (u_{nt} - \bar{u}_{n.} - \bar{u}_{.t} + \bar{u}_{..}).$$

For this notation, as well as a similar approach to covariance analysis, see Scheffe (1959).
In matrix formulation, this may be written as

$$Qy = QX_R\beta_R + Qu. \qquad (5)$$

Consequently, $\hat{\beta}_{RW}$ may be viewed as the OLS estimator obtained from equation
(5) or the GLS estimator from equation (5), where $\sigma_u^2 Q$ is the covariance matrix of
the disturbance term, Qu, and Q is idempotent. Note that Q being idempotent implies that it
is a generalized inverse of itself. The covariance matrix for $\hat{\beta}_{RW}$ is given by
$\sigma_u^2 (X_R' Q X_R)^{-1}$. Because $\hat{\beta}_{RW}$ utilizes the variation of the
independent and dependent variables within each individual, each time period, and both
individual and time period, it is known as the "within" estimator. As shown by Mundlak
(1978), this will play an important role in the error components model.
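The role of Q can be illustrated with a small numerical sketch. It assumes NumPy, simulated data, and observations stacked individual by individual (n outer, t inner); the function names are hypothetical. The sketch builds Q from equation (4), confirms that applying it is the same as subtracting both single means and adding back the grand mean, and computes the "within" estimator for one regressor.

```python
import numpy as np

def within_projection(N, T):
    """Matrix Q of equation (4) for data ordered with n outer and t inner."""
    J_T = np.ones((T, T)) / T
    J_N = np.ones((N, N)) / N
    return (np.eye(N * T) - np.kron(np.eye(N), J_T)
            - np.kron(J_N, np.eye(T)) + np.ones((N * T, N * T)) / (N * T))

def double_demean(a):
    """For a of shape (N, T): remove both single means, add the grand mean."""
    return (a - a.mean(axis=1, keepdims=True)
            - a.mean(axis=0, keepdims=True) + a.mean())

# Simulated (hypothetical) panel with a true slope of 0.7.
rng = np.random.default_rng(1)
N, T = 5, 4
Q = within_projection(N, T)
x = rng.normal(size=(N, T))
gamma = rng.normal(size=(N, 1))
delta = rng.normal(size=(1, T))
y = 2.0 + gamma + delta + 0.7 * x + rng.normal(scale=0.1, size=(N, T))

X_R = x.reshape(N * T, 1)
y_vec = y.reshape(N * T)
beta_rw = np.linalg.solve(X_R.T @ Q @ X_R, X_R.T @ Q @ y_vec)
print(beta_rw)
```

Because Q never has to be formed explicitly (the double-demeaning function gives the same transformed data), the partitioned approach avoids inverting the full (N+T+K-2)-dimensional matrix.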

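If $\hat{\beta}_{RW}$ is estimated using the partitioned model, the remaining parameters in
the model may be estimated from

$$\hat{\alpha} = \bar{y}_{..} - \sum_{k=2}^{K} \bar{x}_{k..}\hat{\beta}_{kW},$$

$$\hat{\gamma}_n = (\bar{y}_{n.} - \bar{y}_{..}) - \sum_{k=2}^{K} (\bar{x}_{kn.} - \bar{x}_{k..})\hat{\beta}_{kW}, \text{ and}$$

$$\hat{\delta}_t = (\bar{y}_{.t} - \bar{y}_{..}) - \sum_{k=2}^{K} (\bar{x}_{k.t} - \bar{x}_{k..})\hat{\beta}_{kW}.$$

These are standard results from covariance analysis, as shown in Scheffe (1959).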
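As a check on these recovery formulas, the sketch below applies them to noise-free simulated data in which the individual and time effects sum to zero, so the true values should be reproduced exactly. NumPy is assumed; the function name and the array layout (regressors stored with shape (K-1, N, T)) are hypothetical choices for the illustration.

```python
import numpy as np

def fixed_effects_from_within(y, x, beta_w):
    """Recover alpha, gamma_n, and delta_t from the within slope estimates.

    y: (N, T); x: (K-1, N, T); beta_w: (K-1,) within estimates.
    """
    y_bar = y.mean()
    x_bar = x.mean(axis=(1, 2))
    alpha = y_bar - x_bar @ beta_w
    gamma = (y.mean(axis=1) - y_bar) - (x.mean(axis=2) - x_bar[:, None]).T @ beta_w
    delta = (y.mean(axis=0) - y_bar) - (x.mean(axis=1) - x_bar[:, None]).T @ beta_w
    return alpha, gamma, delta

# Noise-free (hypothetical) data with effects that sum to zero.
rng = np.random.default_rng(4)
N, T = 5, 6
alpha, beta = 2.0, np.array([0.5, -1.0])
gamma = np.array([-0.4, -0.2, 0.0, 0.2, 0.4])
delta = np.array([-0.5, -0.3, -0.1, 0.1, 0.3, 0.5])
x = rng.normal(size=(2, N, T))
y = alpha + gamma[:, None] + delta[None, :] + np.tensordot(beta, x, axes=1)

a_hat, g_hat, d_hat = fixed_effects_from_within(y, x, beta)
print(a_hat, g_hat, d_hat)
```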

The  covariance  model  framework  may  also  be  used  to  introduce  problems  of 
heteroskedasticity  and  autocorrelation  among  residuals.  A  recent  extension  of  the 
covariance  model  for  time  series  of  cross-sections,  to  include  arbitrary  intertemporal 
covariances, has been made by Kiefer (1980).

Error Components or Variance Components Model


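The error components or variance components models treat $\gamma_n$ and $\delta_t$ in
equation (2) as random variables with

$$E(\gamma_n) = E(\delta_t) = 0;$$

$$E(\gamma_n\gamma_m) = \begin{cases} \sigma_\gamma^2, & n = m \\ 0, & n \neq m; \end{cases}$$

$$E(\delta_t\delta_s) = \begin{cases} \sigma_\delta^2, & t = s \\ 0, & t \neq s; \end{cases} \text{ and}$$

$$E(\gamma_n\delta_t) = E(\gamma_n u_{nt}) = E(\delta_t u_{nt}) = 0 \text{ for all } n \text{ and } t.$$

Therefore, the model can be written for the nth individual as

$$y_n = \gamma_n 1_T + I_T\delta + X_n\beta + u_n,$$

where $\delta' = (\delta_1, \ldots, \delta_T)$ and $X_n$ and $\beta$ include a constant term
and its coefficient, $\alpha$. The full model with NT observations may be written as

$$y = (\gamma \otimes 1_T) + (1_N \otimes I_T)\delta + X\beta + u,$$

where $\gamma' = (\gamma_1, \ldots, \gamma_N)$. The covariance matrix for the individualized
equation may be written as

$$\Sigma_{nm} = E[(\gamma_n 1_T + I_T\delta + u_n)(\gamma_m 1_T + I_T\delta + u_m)'] = \begin{cases} \sigma_\gamma^2 1_T 1_T' + \sigma_\delta^2 I_T + \sigma_u^2 I_T, & n = m \\ \sigma_\delta^2 I_T, & n \neq m, \end{cases}$$

so that the complete covariance matrix for all NT observations becomes

$$\Sigma = \sigma_\gamma^2 (I_N \otimes 1_T 1_T') + \sigma_\delta^2 (1_N 1_N' \otimes I_T) + \sigma_u^2 I_{NT}.$$

In general, $\Sigma$ does not have any simplifiable structure, except as noted at the
beginning of this section.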
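The block structure of this covariance matrix can be made concrete with Kronecker products. The sketch below assumes NumPy, observations stacked individual by individual, and arbitrary (hypothetical) values for the three variances; the function name is likewise hypothetical.

```python
import numpy as np

def error_components_cov(N, T, var_gamma, var_delta, var_u):
    """(NT x NT) covariance matrix of gamma_n + delta_t + u_nt."""
    ones_T = np.ones((T, T))
    ones_N = np.ones((N, N))
    return (var_gamma * np.kron(np.eye(N), ones_T)
            + var_delta * np.kron(ones_N, np.eye(T))
            + var_u * np.eye(N * T))

N, T = 3, 4
var_gamma, var_delta, var_u = 0.8, 0.3, 0.5   # hypothetical values
sigma = error_components_cov(N, T, var_gamma, var_delta, var_u)

# Diagonal block (same individual) and off-diagonal block (different individuals).
same_individual = sigma[:T, :T]
different_individuals = sigma[:T, T:2 * T]
print(same_individual)
print(different_individuals)
```

The diagonal blocks carry the individual, time, and idiosyncratic variances, while the off-diagonal blocks contain only the time-effect variance on their diagonals, since two different individuals share nothing but the common period effect.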
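If the three error variances, $\sigma_u^2$, $\sigma_\gamma^2$, and $\sigma_\delta^2$, are
known, then the GLS estimator for $\beta$,

$$\hat{\beta} = (X'\Sigma^{-1}X)^{-1}(X'\Sigma^{-1}y), \qquad (6)$$

is the minimum-variance, unbiased linear estimator. As is shown by Mundlak (1978) and
others, if $\hat{\beta}$ is partitioned as $(\hat{\alpha}, \hat{\beta}_R')'$, it is possible
to show that $\hat{\beta}_R$ is a matrix-weighted average of three other estimators. This
partitioning results from

$$\Sigma^{-1} = \frac{Q}{\sigma_u^2} + \frac{Q_1}{\sigma_1^2} + \frac{Q_2}{\sigma_2^2} + \frac{Q_3}{\sigma_3^2}, \qquad (7)$$

where Q is defined in equation (4) above,

$$Q_1 = \left(I_N \otimes \frac{1_T 1_T'}{T}\right) - \frac{1_{NT} 1_{NT}'}{NT},$$

$$Q_2 = \left(\frac{1_N 1_N'}{N} \otimes I_T\right) - \frac{1_{NT} 1_{NT}'}{NT},$$

$$Q_3 = I_{NT} - Q - Q_1 - Q_2,$$

$$\sigma_1^2 = T\sigma_\gamma^2 + \sigma_u^2,$$

$$\sigma_2^2 = N\sigma_\delta^2 + \sigma_u^2, \text{ and}$$

$$\sigma_3^2 = T\sigma_\gamma^2 + N\sigma_\delta^2 + \sigma_u^2 = \sigma_1^2 + \sigma_2^2 - \sigma_u^2.$$

Partitioning equation (6) in the manner noted above and using equation (7) yields

$$\hat{\beta}_R = \left[\frac{X_R'QX_R}{\sigma_u^2} + \frac{X_R'Q_1X_R}{\sigma_1^2} + \frac{X_R'Q_2X_R}{\sigma_2^2}\right]^{-1} \left[\frac{X_R'QX_R}{\sigma_u^2}\hat{\beta}_{RW} + \frac{X_R'Q_1X_R}{\sigma_1^2}\hat{\beta}_{R1} + \frac{X_R'Q_2X_R}{\sigma_2^2}\hat{\beta}_{R2}\right]$$

and

$$\hat{\alpha} = \bar{y}_{..} - \sum_{k=2}^{K} \bar{x}_{k..}\hat{\beta}_k,$$

where $\hat{\beta}_{RW}$ is the "within" estimator from the covariance model above. The other
two estimators, $\hat{\beta}_{R1}$ and $\hat{\beta}_{R2}$, come from OLS applied to equation
(2) averaged over time (for each individual) and over individuals (for each time period),
respectively. To see this for $\hat{\beta}_{R1}$, note that

$$\bar{y}_{n.} = \alpha + \gamma_n + \sum_{k=2}^{K} \bar{x}_{kn.}\beta_k + \bar{u}_{n.}$$

is equivalent, in deviations from the overall mean, to

$$Q_1 y = Q_1 X_R \beta_R + Q_1 u,$$

so that $\hat{\beta}_{R1} = (X_R'Q_1X_R)^{-1}(X_R'Q_1y)$.

Similarly, $\hat{\beta}_{R2} = (X_R'Q_2X_R)^{-1}(X_R'Q_2y)$. Consequently, $\hat{\beta}_R$ is
an efficient, matrix-weighted average of the three estimators: (1) $\hat{\beta}_{R1}$, which
is based upon the variation over individuals, (2) $\hat{\beta}_{R2}$, which is based upon the
variation over time, and (3) $\hat{\beta}_{RW}$, which is based upon variation not explained
by differences over individuals or over time. For further details of calculations and
alternative transformations, see Fuller and Battese (1974), Mundlak (1978), Nerlove (1971b),
and Swamy and Arora (1972).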
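The decomposition of the inverse covariance matrix and the weighted-average form of the GLS slope estimator can both be verified numerically. The sketch below is an illustration under stated assumptions: NumPy, simulated data, hypothetical known variances, and observations stacked individual by individual. The names `spectral_pieces`, `beta_gls`, and `beta_avg` are introduced only for this example.

```python
import numpy as np

def spectral_pieces(N, T):
    """Return the four mutually orthogonal idempotent matrices Q, Q1, Q2, Q3."""
    J_T = np.ones((T, T)) / T
    J_N = np.ones((N, N)) / N
    J_NT = np.ones((N * T, N * T)) / (N * T)
    Q1 = np.kron(np.eye(N), J_T) - J_NT     # variation between individuals
    Q2 = np.kron(J_N, np.eye(T)) - J_NT     # variation between time periods
    Q3 = J_NT                               # overall mean
    Q = np.eye(N * T) - Q1 - Q2 - Q3        # "within" variation
    return Q, Q1, Q2, Q3

N, T = 5, 4
var_gamma, var_delta, var_u = 0.8, 0.3, 0.5   # hypothetical known variances
Q, Q1, Q2, Q3 = spectral_pieces(N, T)
s1 = T * var_gamma + var_u
s2 = N * var_delta + var_u
s3 = s1 + s2 - var_u
sigma = (var_gamma * np.kron(np.eye(N), np.ones((T, T)))
         + var_delta * np.kron(np.ones((N, N)), np.eye(T))
         + var_u * np.eye(N * T))
sigma_inv = Q / var_u + Q1 / s1 + Q2 / s2 + Q3 / s3

# Direct GLS with a constant and two simulated regressors.
rng = np.random.default_rng(2)
X_R = rng.normal(size=(N * T, 2))
y = rng.normal(size=N * T)
X = np.hstack([np.ones((N * T, 1)), X_R])
beta_gls = np.linalg.solve(X.T @ sigma_inv @ X, X.T @ sigma_inv @ y)

# Matrix-weighted average of the within and the two between estimators.
parts = [(Q, var_u), (Q1, s1), (Q2, s2)]
W = [X_R.T @ M @ X_R / s for M, s in parts]
b = [np.linalg.solve(X_R.T @ M @ X_R, X_R.T @ M @ y) for M, _ in parts]
beta_avg = np.linalg.solve(sum(W), sum(Wi @ bi for Wi, bi in zip(W, b)))
print(beta_gls[1:], beta_avg)
```

The weights show why GLS leans on the within estimator when the individual and time variances are large: the between components are divided by the larger quantities `s1` and `s2`.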

In most cases, the error components, $\sigma_u^2$, $\sigma_\gamma^2$, and $\sigma_\delta^2$,
are not known. The following estimators, suggested by Swamy and Arora (1972), are unbiased:

$$\hat{\sigma}_1^2 = G_1'G_1/(N-K),$$

$$\hat{\sigma}_2^2 = G_2'G_2/(T-K), \text{ and}$$

$$\hat{\sigma}_u^2 = G'G/[(N-1)(T-1) - (K-1)],$$

where $G_1 = Q_1y - Q_1X_R\hat{\beta}_{R1}$, $G_2 = Q_2y - Q_2X_R\hat{\beta}_{R2}$, and
$G = Qy - QX_R\hat{\beta}_{RW}$ are the residuals from each of the above estimators.
Alternative estimators have been suggested by Amemiya (1971), Fuller and Battese (1973,
1974), Maddala (1971), Nerlove (1971a), Rao (1972), Swamy (1971), and Wallace and Hussain
(1969).
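A sketch of these variance estimators on simulated data follows. It assumes NumPy and observations stacked individual by individual; the function name and the simulated variances are hypothetical. The last step, which backs out the individual and time variances from the relations $\sigma_1^2 = T\sigma_\gamma^2 + \sigma_u^2$ and $\sigma_2^2 = N\sigma_\delta^2 + \sigma_u^2$, is an added convenience and can produce negative values in small samples.

```python
import numpy as np

def swamy_arora_variances(N, T, X_R, y):
    """Variance estimates from the within and the two between regressions."""
    J_NT = np.ones((N * T, N * T)) / (N * T)
    Q1 = np.kron(np.eye(N), np.ones((T, T)) / T) - J_NT
    Q2 = np.kron(np.ones((N, N)) / N, np.eye(T)) - J_NT
    Q = np.eye(N * T) - Q1 - Q2 - J_NT
    K = X_R.shape[1] + 1                      # K counts the constant term

    def resid_ss(M):
        """Residual sum of squares of the regression transformed by M."""
        coef = np.linalg.solve(X_R.T @ M @ X_R, X_R.T @ M @ y)
        g = M @ (y - X_R @ coef)
        return float(g @ g)

    var_u = resid_ss(Q) / ((N - 1) * (T - 1) - (K - 1))
    var_1 = resid_ss(Q1) / (N - K)
    var_2 = resid_ss(Q2) / (T - K)
    return {"var_u": var_u, "var_1": var_1, "var_2": var_2,
            "var_gamma": (var_1 - var_u) / T,
            "var_delta": (var_2 - var_u) / N}

# Simulated (hypothetical) panel: true var_u = 0.25.
rng = np.random.default_rng(5)
N, T = 30, 20
x = rng.normal(size=(N * T, 1))
gamma = rng.normal(scale=1.0, size=N)
delta = rng.normal(scale=0.7, size=T)
u = rng.normal(scale=0.5, size=N * T)
y = 1.0 + np.repeat(gamma, T) + np.tile(delta, N) + 0.5 * x[:, 0] + u
est = swamy_arora_variances(N, T, x, y)
print(est)
```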

When  X  contains  lagged  values  of  the  dependent  variable,  several  problems  are 
introduced.  The  most  significant  are  that  the  parameters  may  not  be  identified  and  that 
the  error  components  estimators  are  no  longer  unbiased.  These  problems,  as  well  as  some 
suggested corrections, are described in Berzeg (1979), Maddala (1971), Nerlove (1967,
1971a), and Swamy (1974).

Several hypotheses are available for testing after the model has been estimated.
Perhaps the most important is the hypothesis of whether anything has been gained by
moving from Model I to Model II. This is equivalent to testing that the vectors
$\gamma = 0$ and $\delta = 0$, which is equivalent to $\sigma_\gamma^2 = \sigma_\delta^2 = 0$.
Under the null hypothesis, the individual components do not exist, so that the OLS
estimator of Model I is minimum-variance unbiased. To carry out the test, the estimator
from the covariance model may be compared with the OLS estimator from Model I via the F
test, which compares the restricted and unrestricted sums of squares. Details are given in
the section on the covariance model above. An asymptotic test of
$\sigma_\gamma^2 = \sigma_\delta^2 = 0$ based upon OLS residuals from the regression of y on
X is available from Breusch and Pagan (1980).

Another  test  of  interest  is  whether  the  slope  coefficients  are  equal  over  the  cross- 
section  and  time-series  subsamples.  As  pointed  out  by  Maddala  (1977),  these  tests  are 
asymptotic  when  performed  within  the  framework  of  an  error  components  model.  The 
test  is  again  the  standard  F-test  using  restricted  and  unrestricted  sums  of  squares,  where 
the  restriction  is  that  the  slope  coefficients  must  be  constant  over  individuals  and  time. 
Further  details  are  given  in  the  section  on  the  covariance  model  above. 

Attempts to generalize the error components model in the directions of a Bayesian
analysis and an errors-in-variables approach have been suggested by Swamy and Mehta
(1973) and Chamberlain and Griliches (1975), respectively.

Conclusions  for  Model  II 


As noted above, the fixed effects assumption of the covariance model implies that it
is desirable to make inferences concerning only the sample at hand. The random effects
assumption of the error components model, however, assumes that the $\gamma_n$ and $\delta_t$ are
random variables and that the individuals and time periods can be regarded as
random samples from larger populations. The desire in the error components model is to
make inferences about the larger populations. Additionally, the error components model
assumes that there is no correlation between $\gamma_n$ and $X_n$ or between $\delta_t$ and $X_t$.

Mundlak (1978) states that, in both the fixed-effects and random-effects cases, $\gamma_n$
and $\delta_t$ may be considered as random but, in the fixed-effects case, inference about the
sample is conditional upon the values of $\gamma_n$ and $\delta_t$ observed in the sample. On the other
hand, in the random-effects case, specific distributional assumptions are made about $\gamma_n$
and $\delta_t$, so that unconditional inference is appropriate. Because no specific distributional
assumptions are made on $\gamma_n$ and $\delta_t$ in the covariance model, it can be used for a
(conceivably) wider class of problems. If, however, the distributional assumptions of the
error components model are correct, a more efficient estimator is obtained than that
available from the covariance model. Mundlak (1978) points out, however, that it may be
preferable to use an estimator that possesses bias but that has a lower mean square error
than an available unbiased estimator.

It should be noted here that the primary interest in both the covariance model and
the  error  components  model  is  in  obtaining  good  estimates  of  the  slope  coefficients. 
Using  the  Mundlak  (1978)  interpretation,  in  the  covariance  model  the  intercepts  are 
random  but  are  conditional  on  the  sample  values  of  the  pure  individual  and  pure  time 
effects.  They  become  the  coefficients  of  dummy  variables  simply  because  the  true  way 
that  individual  and  time  effects  enter  the  model  is  not  known.  In  the  error  components 
model,  the  effects  are  treated  as  unconditionally  random  with  assumed  distributions  but, 
again,  they  function  as  proxies  for  unknown  cross-section  and  time  series  factors. 
Perhaps  a  skeptical  interpretation  of  Model  II  is  that,  although  it  is  desirable  to  account 
for  the  time-series  and  cross-section  effects,  it  is  difficult  or  impossible  to  model  the 
process  adequately. 

An  additional  point,  made  by  Maddala  (1977),  is  that  systematic,  as  opposed  to 
random,  variation  in  the  intercepts  implies  that  the  error  components  model  is  not 
appropriate.  He  also  states  that  it  is  important  to  check  whether  there  is  a  systematic 
pattern  in  the  residuals.  This  should  reveal  whether  the  residuals  are  heteroskedastic  or 
autocorrelated  and,  hence,  what  model  is  appropriate. 
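
One simple way to look for a systematic pattern in the residuals is sketched below. It is an illustration rather than a procedure taken from Maddala (1977): it assumes residuals arranged as an (N, T) array and uses two conventional summaries, the residual variance for each individual (a rough check for heteroskedasticity) and the Durbin-Watson ratio for each individual (a rough check for first-order autocorrelation). The function name and the sample residuals are hypothetical.

```python
import numpy as np

def residual_diagnostics(resid):
    """resid: (N, T) array of residuals from a fitted pooled model.
    Returns the per-individual sample variance and Durbin-Watson ratio."""
    var_by_individual = resid.var(axis=1, ddof=1)
    # Durbin-Watson: sum of squared first differences over sum of squares.
    num = (np.diff(resid, axis=1) ** 2).sum(axis=1)
    den = (resid ** 2).sum(axis=1)
    durbin_watson = num / den
    return var_by_individual, durbin_watson

# Hypothetical residuals for two individuals over four periods.
resid = np.array([[1.0, -1.0, 1.0, -1.0],     # alternating signs
                  [2.0, 2.0, -2.0, -2.0]])    # runs of equal sign, larger scale
variances, dw = residual_diagnostics(resid)
print(variances, dw)
```

Variances that differ widely across individuals point toward heteroskedasticity, while Durbin-Watson ratios far from 2 point toward autocorrelation within individuals.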


A final point is that, since the error components model assumes no correlation
between the effects and the $X_{nt}$, this should be checked and tested. The null hypothesis
is that there is no asymptotic correlation, and an asymptotic test can be carried out via a
comparison of the estimator from the covariance model, $\hat{\beta}_D$, and the estimator from the
error components model, $\hat{\beta}_{RE}$. The test statistic, which is proposed by Hausman (1978)
for use in the errors in variables problem, is

$$m = (\hat{\beta}_D - \hat{\beta}_{RE})' D^{-1} (\hat{\beta}_D - \hat{\beta}_{RE}),$$

where D is the difference between the estimated covariance matrices for $\hat{\beta}_D$ and $\hat{\beta}_{RE}$.
Note that m is asymptotically distributed as a $\chi^2(K-1)$ variable.
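
The computation of m is a single quadratic form once the two sets of estimates are available. The sketch below assumes the slope estimates and their estimated covariance matrices have already been obtained elsewhere; the function name and the numerical inputs are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
import numpy as np

def hausman_statistic(b_cov, b_re, V_cov, V_re):
    """Quadratic form m = d' D^{-1} d, where d is the difference between
    the covariance-model and error-components slope estimates and D is
    the difference between their estimated covariance matrices."""
    d = np.asarray(b_cov, dtype=float) - np.asarray(b_re, dtype=float)
    D = np.asarray(V_cov, dtype=float) - np.asarray(V_re, dtype=float)
    m = float(d @ np.linalg.solve(D, d))
    return m, d.size                           # statistic, coefficients compared

# Hypothetical estimates for two slope coefficients.
b_cov = [1.0, 2.0]                             # covariance (fixed-effects) model
b_re = [0.9, 2.1]                              # error components model
V_cov = np.diag([0.05, 0.05])
V_re = np.diag([0.04, 0.03])

m, n_compared = hausman_statistic(b_cov, b_re, V_cov, V_re)
print(m, n_compared)                           # m = 0.01/0.01 + 0.01/0.02 = 1.5
```

Here the number of coefficients compared plays the role of the degrees of freedom; it coincides with K-1 on the assumption that K counts the intercept as well as the slopes.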


Model  III— Variable  Slope  Coefficients 

The  final  logical  level  of  complexity  of  modelling  the  pooling  process  is  to  allow  all 
the  coefficients  to  reflect  differences  in  behavior  over  individuals  and  over  time.  It  is  no 
longer  necessary  to  treat  intercepts  differently  from  slopes,  so  that  the  general  model  can 
be  written  as 


$$y = X\beta + u,$$

as in the introduction, or as

$$y_{nt} = \sum_{k=1}^{K} x_{knt} \beta_{knt} + u_{nt}, \qquad (8)$$

in the two previous sections. In the most general case shown in equation (8), there is an
immediate difficulty. Since there are KNT parameters to be estimated from NT
observations, some further structure must be placed on the coefficients. The usual
structure is to set

$$\beta_{knt} = \alpha_k + \gamma_{kn} + \delta_{kt}, \qquad (9)$$

where $\alpha_k$ represents some mean effect over all individuals and time periods, $\gamma_{kn}$
represents the effects due to specific individuals, and $\delta_{kt}$ represents the effects due to
specific time periods. Note that now the individual and time effects, as well as the mean
effect, are specific to individual columns of X.
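
The parameter-counting argument can be made concrete with a few lines of arithmetic. The dimensions used below (K = 3, N = 10, T = 14) are hypothetical and serve only as an illustration.

```python
def parameter_counts(K, N, T):
    """Counts for the varying-coefficient model.
    unrestricted: one coefficient for every (k, n, t) combination;
    structured:   a mean, N individual, and T time effects per regressor;
    observations: number of data points in a balanced panel."""
    unrestricted = K * N * T
    structured = K * (1 + N + T)
    observations = N * T
    return unrestricted, structured, observations

# Hypothetical dimensions: 3 regressors, 10 individuals, 14 time periods.
unrestricted, structured, observations = parameter_counts(K=3, N=10, T=14)
print(unrestricted, structured, observations)   # 420 75 140
```

With these dimensions the unrestricted model has three times as many coefficients as observations, whereas the additive structure leaves the number of coefficients well below the number of observations.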

As in Model II, $\gamma_{kn}$ and $\delta_{kt}$ may be assumed to be either fixed or random and, again,
this is how the various approaches to estimation of Model III will be dichotomized.
However,  in  this  case,  there  are  ways  of  estimating  the  fixed  effects,  all  of  which  are 
extensions  of  the  covariance  model  presented  above.  The  following  discussion  of  both 
types  of  Model  III  will  closely  follow  Hsiao  (1974,  1975). 

The  version  of  Model  III  represented  by  equations  (8)  and  (9)  is  the  most  general 
formulation  of  the  varying  slopes  hypothesis.  As  in  Model  II,  restricted  versions  of  the 
formulation  are  available  in  which  either  the  time  or  individual  effects  are  absent.  For 
the  fixed-effects  case,  these  will  be  treated  below  since  the  assumptions  are  crucial  to 
the  estimation  procedures.  In  the  random-effects  case,  further  discussion  of  the 
restricted  version  may  be  found  in  Rosenberg  (1973a)  and  Swamy  (1970,  1971,  1973,  1974). 
Further discussion of the general formulation is found in Hsiao (1974, 1975).

Fixed  Effects 


When $\gamma_{kn}$ and $\delta_{kt}$ are treated as fixed parameters, an immediate useful simplification
is to permit either $\gamma_{kn}$ or $\delta_{kt}$ to disappear so that a time series of cross-section
equations or a cross-section of time-series equations, respectively, is produced.
Additionally, the coefficients can be defined as $\beta_{kt}$ or $\beta_{kn}$, respectively, so as to
incorporate the time effects or individual effects into the mean effects of the
coefficients of the explanatory variables. Once in this form, Zellner's (1962) seemingly
unrelated regressions model (or joint GLS) can be applied to allow different time-series
equations in each cross-section or different cross-section equations in each time series.
The joint estimation can be carried out for separate $\beta_{kn}$ or $\beta_{kt}$, as well as common
$\beta_k$, and the hypothesis of equal $\beta_k$ may be tested. Details are contained in Zellner
(1962). One should check that the assumptions of
the  Zellner  model  hold  for  a  particular  application  prior  to  estimation  and  testing. 
Estimation  techniques  for  differing  assumptions  on  error  structures  are  contained  in 
Judge,  Griffiths,  Hill,  and  Lee  (1980). 
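
The joint GLS idea can be sketched as a two-step feasible procedure. The code below is a minimal illustration under stated assumptions, not a reproduction of Zellner's (1962) derivations: it treats each individual as one time-series equation, assumes errors that are contemporaneously correlated across equations but not autocorrelated, and estimates the cross-equation covariance from equation-by-equation OLS residuals. The function name and simulated values are hypothetical.

```python
import numpy as np

def sur_fgls(X_list, y_list):
    """Two-step feasible GLS for a system with one equation per individual.
    X_list[i]: (T, k_i) regressors; y_list[i]: (T,) dependent variable."""
    T = y_list[0].shape[0]
    # Step 1: equation-by-equation OLS residuals.
    resid = []
    for X, y in zip(X_list, y_list):
        b = np.linalg.lstsq(X, y, rcond=None)[0]
        resid.append(y - X @ b)
    E = np.column_stack(resid)                 # (T, N) residual matrix
    Sigma = E.T @ E / T                        # cross-equation covariance
    # Step 2: GLS on the stacked system, error covariance Sigma kron I_T.
    N = len(X_list)
    widths = [X.shape[1] for X in X_list]
    X_big = np.zeros((N * T, sum(widths)))     # block-diagonal regressors
    col = 0
    for i, X in enumerate(X_list):
        X_big[i * T:(i + 1) * T, col:col + widths[i]] = X
        col += widths[i]
    y_big = np.concatenate(y_list)
    W = np.kron(np.linalg.inv(Sigma), np.eye(T))
    A = X_big.T @ W @ X_big
    beta = np.linalg.solve(A, X_big.T @ W @ y_big)
    return beta, np.linalg.inv(A), Sigma

# Simulated system: 3 individuals, 20 periods, intercept plus one regressor.
rng = np.random.default_rng(0)
T = 20
X_list = [np.column_stack([np.ones(T), rng.normal(size=T)]) for _ in range(3)]
common = rng.normal(size=T)                    # shock shared across equations
y_list = [X @ np.array([1.0, s]) + common + rng.normal(size=T)
          for X, s in zip(X_list, (0.5, 1.0, 1.5))]

beta, cov_beta, Sigma = sur_fgls(X_list, y_list)
print(beta.reshape(3, 2))                      # one (intercept, slope) row per individual
```

A test of equal coefficients across equations would then be a linear restriction on the stacked coefficient vector, using its estimated covariance matrix.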


The full model of equations (8) and (9) can also be viewed as a covariance model in
the case where $\gamma_{kn}$ and $\delta_{kt}$ are assumed to be fixed effects. In the matrix version of the
model, the matrix of explanatory variables has the dimension $NT \times (T+N+1)K$ while its
rank is only $(T+N-1)K$, so that 2K independent linear restrictions must be imposed on
the $\gamma_{kn}$ and $\delta_{kt}$. Hsiao (1975) suggests the natural restrictions

$$\sum_{n=1}^{N} \gamma_{kn} = \sum_{t=1}^{T} \delta_{kt} = 0, \qquad k = 1, \ldots, K,$$

as the 2K restrictions necessary. Again, standard F tests comparing restricted with
unrestricted sums of squares may be carried out here, just as in the case of Model II.
Note that the degrees of freedom constraint for this full version of the covariance model
is $NT > (T+N-1)K$ and this is likely to be violated, except in the case of large N and large T.
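
The dimension and rank statements can be checked numerically on a small example. The sketch below builds the matrix of explanatory variables for hypothetical dimensions (N = 5, T = 6, K = 2) from randomly generated regressors; the function name and the data are assumptions made for illustration.

```python
import numpy as np

def full_covariance_design(x):
    """x: (N, T, K) array of regressors.
    For each regressor k the design holds x_k itself, x_k interacted with
    N individual dummies, and x_k interacted with T time dummies."""
    N, T, K = x.shape
    X = x.reshape(N * T, K)                        # rows ordered by n, then t
    ind = np.kron(np.eye(N), np.ones((T, 1)))      # (NT, N) individual dummies
    tim = np.kron(np.ones((N, 1)), np.eye(T))      # (NT, T) time dummies
    blocks = []
    for k in range(K):
        xk = X[:, [k]]
        blocks += [xk, xk * ind, xk * tim]
    return np.hstack(blocks)

rng = np.random.default_rng(0)
N, T, K = 5, 6, 2
W = full_covariance_design(rng.normal(size=(N, T, K)))
rank = np.linalg.matrix_rank(W)
print(W.shape, rank, (T + N - 1) * K)
```

The difference between the number of columns and the rank is 2K, the number of restrictions that must be imposed before the coefficients can be estimated.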
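
Random Effects


Given the model of equations (8) and (9), which can be written

$$y_{nt} = \sum_{k=1}^{K} (\alpha_k + \gamma_{kn} + \delta_{kt}) x_{knt} + u_{nt},$$

the model for the nth individual can be written as

$$y_n = X_n \alpha + X_n \gamma_n + Z_n \delta + u_n, \qquad (10)$$

where $y_n$ is $(T \times 1)$, $X_n$ is $(T \times K)$, $\gamma_n' = (\gamma_{1n}, \ldots, \gamma_{Kn})$,
$\delta' = (\delta_1', \ldots, \delta_T')$, $\delta_t' = (\delta_{1t}, \ldots, \delta_{Kt})$, $u_n' = (u_{n1}, \ldots, u_{nT})$,

$$Z_n = \begin{bmatrix} x_{n1}' & & 0 \\ & \ddots & \\ 0 & & x_{nT}' \end{bmatrix} \quad (T \times TK),$$

and $x_{nt}' = (x_{1nt}, \ldots, x_{Knt})$. Hsiao's (1975) assumptions are that

$$E(u_n) = 0 = E(\gamma_n), \qquad E(\delta_t) = 0,$$

$$E(u_n u_m') = \begin{cases} \sigma_u^2 I_T & \text{for } n = m \\ 0 & \text{for } n \neq m, \end{cases}$$

$$E(\gamma_n \gamma_m') = \begin{cases} A & \text{for } n = m \\ 0 & \text{for } n \neq m, \end{cases} \quad \text{and}$$

$$E(\delta_s \delta_t') = \begin{cases} B & \text{for } s = t \\ 0 & \text{for } s \neq t. \end{cases}$$

Additionally, he assumes $\gamma_n$, $\delta_t$, and $u_n$ are all uncorrelated and that A and B are diagonal
with elements $a_k$ and $b_k$ respectively.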

Rewriting (10) to include all NT observations gives

$$y = X\alpha + Z\gamma + \tilde{Z}\delta + u,$$

where Z is block diagonal with $X_n$ as the nth diagonal block, $\tilde{Z}' = (Z_1', \ldots, Z_N')$, and the
other vectors and matrices are stacked in the natural way.

Given the assumptions above, as well as the assumption that $\gamma$ and $\delta$ are random, the
covariance matrix for the complete error term is

$$\Omega = E[(Z\gamma + \tilde{Z}\delta + u)(Z\gamma + \tilde{Z}\delta + u)'] = Z(I_N \otimes A)Z' + \tilde{Z}(I_T \otimes B)\tilde{Z}' + \sigma_u^2 I_{NT}.$$

Then the GLS estimator $\hat{\alpha} = (X'\Omega^{-1}X)^{-1} X'\Omega^{-1}y$ is the minimum-variance unbiased linear
estimator for $\alpha$ and has the covariance matrix $(X'\Omega^{-1}X)^{-1}$.
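
The construction of the error covariance matrix and the GLS estimator can be sketched as follows. This illustration assumes that A, B, and the disturbance variance are known, which is rarely the case in practice; the function names, the dimensions (N = 4, T = 5, K = 2), and the parameter values are hypothetical. In the code, `Z` is the block-diagonal matrix of the individual regressor matrices and `Z_tilde` is the vertical stack of the matrices written $Z_n$ in equation (10).

```python
import numpy as np

def build_omega(X_by_ind, A, B, sigma2_u):
    """X_by_ind: (N, T, K) array whose [n, t] entry is the row vector x_nt'.
    Returns the (NT, NT) covariance matrix of the composite error."""
    N, T, K = X_by_ind.shape
    Z = np.zeros((N * T, N * K))               # block diagonal, blocks X_n
    Z_tilde = np.zeros((N * T, T * K))         # stacked Z_n, each (T, TK)
    for n in range(N):
        Z[n * T:(n + 1) * T, n * K:(n + 1) * K] = X_by_ind[n]
        for t in range(T):
            Z_tilde[n * T + t, t * K:(t + 1) * K] = X_by_ind[n, t]
    return (Z @ np.kron(np.eye(N), A) @ Z.T
            + Z_tilde @ np.kron(np.eye(T), B) @ Z_tilde.T
            + sigma2_u * np.eye(N * T))

def gls(X, y, omega):
    """GLS estimate of the mean coefficients and its covariance matrix."""
    omega_inv = np.linalg.inv(omega)
    cov = np.linalg.inv(X.T @ omega_inv @ X)
    return cov @ X.T @ omega_inv @ y, cov

# Simulated data under the random-coefficient assumptions (assumed values).
rng = np.random.default_rng(0)
N, T, K = 4, 5, 2
X_by_ind = rng.normal(size=(N, T, K))
alpha = np.array([1.0, -0.5])
A = np.diag([0.5, 0.2])                        # variances of individual effects
B = np.diag([0.3, 0.1])                        # variances of time effects
sigma2_u = 1.0
gamma = rng.normal(size=(N, K)) * np.sqrt(np.diag(A))
delta = rng.normal(size=(T, K)) * np.sqrt(np.diag(B))
coef = alpha + gamma[:, None, :] + delta[None, :, :]      # (N, T, K)
y = (X_by_ind * coef).sum(axis=2) + rng.normal(scale=np.sqrt(sigma2_u), size=(N, T))

X = X_by_ind.reshape(N * T, K)                 # rows ordered by n, then t
omega = build_omega(X_by_ind, A, B, sigma2_u)
alpha_hat, cov_alpha = gls(X, y.ravel(), omega)
print(alpha_hat)
```

When A, B, and the disturbance variance are unknown, estimates of them would be substituted into `build_omega`, giving a feasible version of the same estimator.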

Unless A, B and $\sigma_u^2$ are known, they will have to be estimated from the sample. Hsiao
gives two estimation techniques for the unknown parameters of $\Omega$. One is a direct
maximum likelihood approach, while the other is an indirect minimum norm quadratic
unbiased estimator (MINQUE) approach. Since the estimation procedures are both rather
complicated, the reader is referred to Hsiao (1974, 1975) for details. Hypotheses
concerning constancy of coefficients over time may be tested via procedures given in
Hsiao (1974).

Conclusions for Model III


As in Model II, there is a choice of whether to assume the effects are fixed or
random. The conclusions given previously concerning Model II carry over completely in
the case of Model III. Again, the most important consideration is correlation of the
random coefficients with the explanatory variables. If this correlation is present, then
the GLS estimator for the random-effects case is biased, so it is probably better to use
the fixed-effects model. If, however, such correlation is not present and the distributional
assumptions of the random-effects model are reasonable, this model will provide more
efficient estimates than the fixed-effects model. Also, systematic variation of the
individual and time effects, as opposed to random variation, is an indication that the
fixed-effects model is preferable.

Extensions of Model III in various directions have been suggested by Singh and Ullah
(1974), Swamy (1974), and Swamy and Mehta (1975, 1977).


CONCLUSIONS 

The  random-effects  models  above,  when  combined  with  the  coefficients  of  the 
explanatory  variables,  imply  the  mixed  model  of  the  classical  analysis  of  variance 
literature.  Additionally,  the  random-effects  models  can  be  viewed  as  an  intermediate 
step  between  the  OLS  model  where  all  the  coefficients  are  the  same  for  all  individuals 
and  time  periods  and  the  covariance  models  where  the  coefficients  are  different  for  all 
individuals  and  time  periods. 
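
The intermediate position of the random-effects estimator can be illustrated in the simplest setting. The sketch below assumes a one-way error components model (individual effects only) with a single regressor, for which GLS is equivalent to OLS on data from which a fraction theta of each individual mean has been subtracted. The function names, the variance values, and the simulated data are hypothetical.

```python
import numpy as np

def theta_from_variances(sigma2_u, sigma2_gamma, T):
    """Quasi-demeaning weight for the one-way error components model."""
    return 1.0 - np.sqrt(sigma2_u / (sigma2_u + T * sigma2_gamma))

def quasi_demeaned_slope(x, y, theta):
    """Slope from OLS after subtracting theta times the individual means.
    theta = 0 reproduces pooled OLS; theta = 1 reproduces the
    covariance-model (within) estimator. x, y: (N, T) arrays."""
    xs = x - theta * x.mean(axis=1, keepdims=True)
    ys = y - theta * y.mean(axis=1, keepdims=True)
    # The transformed intercept column is (1 - theta) for every observation.
    X = np.column_stack([np.full(x.size, 1.0 - theta), xs.ravel()])
    coef = np.linalg.lstsq(X, ys.ravel(), rcond=None)[0]
    return float(coef[1])

# Simulated data (assumed values, for illustration only).
rng = np.random.default_rng(0)
N, T = 6, 3
x = rng.normal(size=(N, T)) + rng.normal(size=(N, 1))
y = 2.0 + 0.7 * x + rng.normal(size=(N, 1)) + rng.normal(size=(N, T))

theta = theta_from_variances(sigma2_u=1.0, sigma2_gamma=1.0, T=T)
slope_pooled = quasi_demeaned_slope(x, y, 0.0)
slope_random = quasi_demeaned_slope(x, y, theta)
slope_within = quasi_demeaned_slope(x, y, 1.0)
print(theta, slope_pooled, slope_random, slope_within)
```

As the variance of the individual effects goes to zero, theta goes to zero and the estimator approaches pooled OLS; as that variance or T grows, theta approaches one and the estimator approaches the covariance-model estimator.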

A  point  that  should  be  heavily  stressed  when  making  choices  over  pooling  models  is 
the  importance  of  the  sample  sizes  for  both  the  cross-section  and  the  time-series  samples. 
These  sample  sizes  determine  both  how  the  data  can  or  cannot  be  grouped  and  place 
constraints  on  the  choice  of  particular  estimation  methods.  Specifically,  estimators  of  the 
random-effects  models  maintain  their  properties  only  when  a  reasonably  large  sample  is 
available. 

One model, not seen in the pooling literature, is one in which the slope coefficients
vary over individuals and time periods but the intercept coefficients are constant. This is
a variation of Model II that can be handled by a minor extension of Model III. It is not
clear that this model would have any applications in economics.

The organization of the three models discussed in this report is hierarchical in nature.
This enables the user to move down the list of models as he or she discovers, via either
hypothesis testing of fitted models or prior information, that imposed restrictions must be
successively relaxed. It should be noted that Models II and III imply that $X\beta$ does not
account for all the modelling variation in y. In the random-effects case, this remaining
variation is modeled by classifying its distribution; in the fixed-effects case, it is modeled
via dummy variables. A third alternative that exists in this modelling scheme is to allow
random variation of the individual and time components but place it within a structural
framework. This leads into the varying parameter literature. Note that this is the next
logical step from Model III above. Some applications of varying parameter models to
pooling problems are considered by Johnson and Rausser (1975), Rosenberg (1973b),
Saxonhouse (1977), and Greenwood, Ladman, and Siegel (1981). The literature on varying
parameter models is voluminous and still growing. Good overviews are available in Judge,
Griffiths, Hill, and Lee (1980) and Maddala (1977).

It is important to remember that, after pooling problems have been dealt with, the
application of standard regression tools to the pooled problem at hand should not be
forgotten. An excellent example of this is provided by Izan (1980).